Adaptive neuro-fuzzy modeling applied to policy gradient reinforcement learning
Authors
Abstract
Function approximation has been used extensively with reinforcement learning, even though theoretical support was based mainly on tabular representations. This paper proposes an actor-critic structure that follows the existing convergence proofs as closely as possible. The actor and critic modules employ an adaptive neuro-fuzzy architecture based on fuzzy ARTMAP concepts and gradient descent. Results on the well-known mountain car task indicate the viability of this approach, which is probably the first to use a self-growing structure for this kind of task.
Similar resources
Adaptive swarm behavior acquisition by a neuro-fuzzy system and reinforcement learning algorithm
Purpose – A neuro-fuzzy system with a reinforcement learning algorithm (RL) for adaptive swarm behaviors acquisition is presented. The basic idea is that each individual (agent) has the same internal model and the same learning procedure, and the adaptive behaviors are acquired only by the reward or punishment from the environment. The formation of the swarm is also designed by RL, e.g., TD-err...
Repetitive Tracking Control of Nonlinear Systems Using Reinforcement Fuzzy-Neural Adaptive Iterative Learning Controller
This paper proposes a new fuzzy neural network based reinforcement adaptive iterative learning controller for a class of nonlinear systems. Different from some existing reinforcement learning schemes, the reinforcement adaptive iterative learning controller has the advantages of rigorous proofs without using an approximation of the plant Jacobian. The critic is appended into the reinforcement a...
Evaluation of the Efficiency of the Adaptive Neuro Fuzzy Inference System (ANFIS) in the Modeling of the Ionosphere Total Electron Content Time Series Case Study: Tehran Permanent GPS Station
Global positioning system (GPS) measurements provide accurate and continuous 3-dimensional position, velocity and time data anywhere on or above the surface of the earth, anytime, and in all weather conditions. However, the predominant ranging error source for GPS signals is an ionospheric error. The ionosphere is the region of the atmosphere from about 60 km to more than 1500 km above the eart...
A Distributed Adaptive Neuro-Fuzzy Network for Chaotic Time Series Prediction
In this paper a Distributed Adaptive Neuro-Fuzzy Architecture (DANFA) model with a second order Takagi-Sugeno inference mechanism is presented. The proposed approach is based on the simple idea to reduce the number of the fuzzy rules and the computational load, when modeling nonlinear systems. As a learning procedure for the designed structure a two-step gradient descent algorithm with a fixed ...
A Self-Generating Neuro-Fuzzy System Through Reinforcements
In this paper, a novel self-generating neuro-fuzzy system through reinforcements is proposed. Not only the weights of the network but also the architecture of the whole network are all learned through reinforcement learning. The proposed neuro-fuzzy system is applied to the inverted pendulum system to demonstrate its performance. Key-words: reinforcement learning, neural network, neuro-fuzzy sy...